Knowledge Base Compilation and the Language Design Game

Author

  • Warren Sack
Abstract

The ProgramCritic is a system designed to analyze and critique students' computer programs. After analyzing a program, the ProgramCritic provides the student with a list of English-language comments detailing the strengths and weaknesses of the student's program. The foundation of the ProgramCritic's analytic abilities is a set of "knowledge bases" which describe a range of programming problems and the ways in which parts of each problem can be solved. Several other systems have been built by other researchers with a functionality similar to the ProgramCritic's; notable among them is PROUST [Johnson 1986]. Differences between the ProgramCritic and PROUST are described through a detailed explanation of how one might build a compiler for PROUST's knowledge base language. Shortcomings of PROUST's knowledge base language are pointed out one by one, and for each shortcoming a fix is proposed. Integrating the proposed fixes serves to explain the knowledge base language used in the ProgramCritic.

Introduction: Automatic Program Debuggers and Language Games

Recently there has been an explosion in new paradigms for research in computers and education. Among others, these have been advanced as "new" frameworks: "situated learning", "constructionism", "discovery learning", and "interactive learning environments." Perhaps, however, "paradigm" is too strong a term to describe the banners under which these small subgroups of the computer and education community function. In fact, the manner in which the recent "paradigms" have been introduced has more closely resembled a political process whereby manifestos are drafted and dissenters are excommunicated. These sorts of rhetoric games are visible in most disciplines, even most sciences (see [Latour 1987] for some interesting examples), and it is not with distaste that I point them out now. Quite the contrary.
I rather enjoy the lively debate the manifestos engender and will soon co-author one on constructivist technologies for education [Sack, Soloway, Weingrad and Guzdial (in preparation)]. The historian of science, Thomas Kuhn, might say that the field of computers and education is in a "pre-paradigm period":

  Throughout the pre-paradigm period when there is a multiplicity of competing schools, evidence of progress, except within schools, is very hard to find. This is the period described ... as one during which individuals practice science, but in which the results of their enterprise do not add up to science as we know it. And again, during periods of revolution when the fundamental tenets of a field are once more at issue, doubts are repeatedly expressed about the very possibility of continued progress if one or another of the opposed paradigms is adopted. [Kuhn 1962: 163]

In the 1930s the philosopher Ludwig Wittgenstein coined a term which might be of use to us in sorting out the multiple "pre-paradigms" of the computers and education field; a language game was what Wittgenstein eventually called the endeavors of philosophers (of whom he was one). Thus if I refer to the multiple "pre-paradigms" of computers and education as language games, I do so not flippantly, but rather with the utmost respect and with a very solid and serious Wittgensteinian philosophical tradition behind me. This paper will focus on some "progress" that I have made in one of the language games of the field of computers and education.

* Appears in Intelligent Tutoring Systems, Second International Conference (Lecture Notes in Computer Science), Claude Frasson, Gilles Gauthier, and Gordon McCalla (editors) (Berlin: Springer-Verlag, 1992).

To explain this "progress" more exactly, I will compare my design for a recently implemented automatic program debugger for novices, the ProgramCritic [Sack and Bennett (patent pending)], with a system of similar functionality that is considered a milestone in the field of intelligent tutoring systems, PROUST [Johnson 1986]. Automatic program debuggers are, essentially, sophisticated parsers which can analyze a student's computer program, identify strengths and weaknesses in the student's work, and then present to the student a critique of that work in the form of a list of comments (written in a natural language such as English). Good program debuggers can comment not only on errors of syntax but, more interestingly, on errors in a student's solution which are specific to the problem that the student is trying to solve. Addressing a student's non-syntactic errors requires that an automatic program debugger be provided with a description of the problem that the student is working on and also with a library of ways in which students often solve (or fail to solve) specific parts of the problem. The machine-readable problem descriptions employed by "intelligent" program debuggers are often referred to as "knowledge bases."

In staking out PROUST's importance for the field of computers and education it has been claimed that

  • PROUST analyzes a program by constructing a model of the student's intentions and their realization. [Johnson 1986: 15]; and,
  • ... PROUST's explicit reconstruction of the programmer's intentions constitutes an important advance whose ramifications are not limited to computer programming.
    [Wenger 1986: 251]

In showing why my system, the ProgramCritic, constitutes "progress" over PROUST, I steer around the language game played by PROUST's designers which would claim some sort of metahermeneutical status for PROUST; i.e., I am not going to discuss the "interpretive" powers, or the (re)construction of "intentions", by either PROUST or the ProgramCritic. It is problematic to claim that the ProgramCritic is "better than" PROUST if one will not (as I will not) play the language game of the designers of the precedent system. After all, how shall I claim any "progress" if I do not claim that the ProgramCritic can "reconstruct intentions" better than PROUST can? Differences do exist between technological artifacts (like automatic program debuggers), but any notion of "progress" is purely fictitious. However, fictions of "progress" are often framed within a recurrent narrative common to a field. I will explain the ProgramCritic's "superiority" over PROUST by referring to an "Occam's Razor" sort of story that is often heard in the field of artificial intelligence. This sort of story often runs like this: X writes a 10,000 line program for a dissertation to illustrate a theory about Q; Q is "proven to be true" by a variety of practitioners (often from related fields like psychology or linguistics); Y shows how a program of 100 lines (and of equivalent functionality to X's original program) can be written by throwing out the original theory Q and replacing it with something much more mundane.

Let me list a couple of examples of "Occam's Razor"-like stories that have occurred in the literature of artificial intelligence. SAM [Schank and Riesbeck 1981] was a very complicated program constructed to show how SCRIPTs (i.e., a particular flavor of schemas which have appeared throughout the history of psychology ever since Kant invented them) explained human understanding of stories. SAM was constructed to "understand" stories using SCRIPTs.
Now, in several introductory programming books, one can find a "rewrite" of SAM, with functionality equal to the original dissertation work, that is a trivial program one page in length (e.g., [Sterling and Shapiro 1986: 234]). "Explanation-Based Learning" [Minton et al. 1989] has suffered a similar, yet still debated, fate. In their 1988 paper, van Harmelen and Bundy show how the supposedly complex artificial intelligence technique of explanation-based generalization can be re-explained as plain-old partial evaluation (a technique used by many computer language designers). In van Harmelen and Bundy's re-description, "explanations" become "proofs" and partial evaluation is illustrated by a tiny one-page program.

It is instructive to note that the people who are the best at telling "Occam's Razor" sorts of stories in the artificial intelligence literature are often computer programming language designers. By looking at so-called "knowledge structures" (like SCRIPTs and "explanations") with the critical eye of the language designer, one often finds that "knowledge structures" look more like ordinary computer programs than some sort of mental entities which constitute "psychological reality." For the rest of this paper I will call this sort of critical examination of knowledge representation languages the language design game, because it is language designers who do it so well.

In what follows I play the language design game with entities used in PROUST called "goals, plans, and bugs." According to the designer of PROUST, PROUST uses a "knowledge" of "goals, plans, and bugs" to diagnose a student's "intentions." I will show why one can view "goals, plans, and bugs" as plain-old procedures. By rephrasing them in such a manner, I explain why, of the 15,000 lines of Lisp code that it took to construct PROUST, most was devoted to what I would call an unsuccessful attempt to implement a functionality equivalent to a Prolog interpreter.
Using this insight I was able to duplicate PROUST's functionality in my system, the ProgramCritic, with about 100 lines of code.

Dispensing with Bug Rules

PROUST's knowledge base contains goals, plans and bug rules. Goals and plans were written by [Johnson 1986] to describe correct methods for solving particular parts of a certain programming problem: goals and plans describe idiomatic methods for solving a given programming problem. PROUST's bug rules were designed to describe ways in which the goals and plans might be transformed, or permuted, into incorrect methods. The architecture of PROUST's goals, plans and bug rules appears to have been influenced by the production rule literature (and most especially by the work of Brown, Burton and VanLehn [Brown and Burton 1978; Brown and VanLehn 1980]). Two sets of rules are normally used in production system models of students: correct rules, and incorrect, mal-, or bug-rules. From the perspective of the production rule game, the construct of bug rules is theoretically important. However, from the perspective of the language design game, PROUST's bug rules are a weakness for two reasons:

(1) Execution suspension: PROUST suspends the matching of goals and plans if a plan cannot be matched. Bug rules are applied to "explain" the mismatches encountered during the matching of a plan. If the mismatches can be accounted for by some bug rule, then goal and plan matching is resumed. Suspending and resuming execution of PROUST's matching interpreter requires that PROUST do a lot of "bookkeeping." That is, while it is analyzing a student's Pascal program, PROUST builds an "interpretation tree" (a sort of fancy parse tree) in order to record the state of the matching process. The "interpretation tree" contains the information necessary to resume a suspended matching process. Keeping a trace of the matching process (like the "interpretation tree") does not necessarily demand a lot of computer space (i.e., memory).
For example, most interpreters of the Prolog programming language keep a very space-efficient execution trace to allow backtracking. However, the way in which "interpretation trees" are stored by PROUST is very space demanding.

(2) Unclear semantics: Johnson was unable to devise a declarative form for PROUST's bug rules:

  ... a fully declarative plan-difference [i.e., bug] rule test representation cannot be achieved. [Johnson 1985: 173]

As a consequence, bug rules in PROUST are arbitrary Lisp functions. They are arbitrary in the sense that bug rules can be (and are) written to (a) modify the "interpretation tree", i.e., the state of the goal and plan matching process; (b) modify the structure of the student program being analyzed; and (c) modify the variable bindings of the plans and goals being matched [Johnson 1986: 180]. Consequently, not even the goals and plans in PROUST have a semantics that is independent of the state of the matching interpreter, because bug rules in PROUST modify variable bindings and other vital pieces of information necessary to describe the state of the matching process.

From a language designer's point of view an alternative is necessary: eliminate bug rules from the architecture and replace them with buggy plan and goal variants written in the same notation as the correct goals and plans. Without bug rules, execution of the matcher does not need to be suspended and one gains a declarative semantics for bugs.

MicroPROUST's Goals and Plans

After Johnson wrote PROUST he designed a "micro" version of it which he calls MicroPROUST [Johnson and Soloway 1985a; Johnson and Soloway 1985b]. The main difference between the goal and plan knowledge structures employed in PROUST and those used in MicroPROUST is that one cannot specify subgoals within a MicroPROUST plan. I will center the following discussion around MicroPROUST plans because they are a bit easier to understand than PROUST's plans.
In a later section I will reintroduce subgoals into plans, thus bridging the difference between the plans of PROUST and MicroPROUST. The MicroPROUST plan shown below is designed to match the piece of Pascal code which follows it. (Those readers who have written plans for MicroPROUST will recognize that the plan syntax shown is not exactly the one used in the MicroPROUST described by [Johnson and Soloway 1985b]. However, the syntax shown is "isomorphic" to the original syntax in that one can mechanically translate from one syntax into the other. I have written a short piece of Lisp code to do this translation.)
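
The argument of the section "Dispensing with Bug Rules" can be illustrated with a small sketch. It is not the MicroPROUST plan referred to above, and it is not code from PROUST or the ProgramCritic. It is a hypothetical Python illustration in which every name (Plan, unify, match_plan, critique, tokenize, PLANS) and the token-tuple plan notation are assumptions made for this example. Correct plans and buggy variants are written in the same declarative notation, and a generator-based backtracking matcher tries them in order, so no matching process has to be suspended, patched by a rule, and resumed.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional, Sequence, Tuple

# Assumed toy notation: a template is a tuple of tokens, and a token that
# starts with "?" is a pattern variable.
Template = Tuple[str, ...]
Bindings = Dict[str, str]


@dataclass(frozen=True)
class Plan:
    goal: str                        # goal this plan achieves
    templates: Tuple[Template, ...]  # statement patterns, in program order
    comment: Optional[str] = None    # None: correct plan; text: buggy variant


def unify(template: Template, stmt: Sequence[str],
          bindings: Bindings) -> Optional[Bindings]:
    """Match one template against one tokenized statement."""
    if len(template) != len(stmt):
        return None
    new = dict(bindings)  # copy, so a failed branch never alters its caller
    for pat, tok in zip(template, stmt):
        if pat.startswith("?"):
            if new.setdefault(pat, tok) != tok:  # already bound differently
                return None
        elif pat != tok:
            return None
    return new


def match_plan(templates: Sequence[Template], program: Sequence[Sequence[str]],
               start: int = 0,
               bindings: Optional[Bindings] = None) -> Iterator[Bindings]:
    """Yield bindings for each way the templates match statements, in order."""
    bindings = {} if bindings is None else bindings
    if not templates:
        yield bindings
        return
    for i in range(start, len(program)):
        extended = unify(templates[0], program[i], bindings)
        if extended is not None:
            # If this inner generator yields nothing, the loop simply moves
            # on to the next statement: that is the backtracking step.
            yield from match_plan(templates[1:], program, i + 1, extended)


def critique(goal: str, plans: Sequence[Plan],
             program: Sequence[Sequence[str]]) -> List[str]:
    """Try each plan for the goal in order; buggy variants carry a comment."""
    for plan in plans:
        if plan.goal != goal:
            continue
        found = next(match_plan(plan.templates, program), None)
        if found is not None:
            if plan.comment is None:
                return []
            fields = {name[1:]: tok for name, tok in found.items()}
            return [plan.comment.format(**fields)]
    return [f"No known plan for goal '{goal}' was recognized."]


def tokenize(source: str) -> List[List[str]]:
    """Split a ';'-separated toy program into whitespace-tokenized statements."""
    return [s.split() for s in source.split(";") if s.strip()]


# Hypothetical knowledge base: the correct plan first, then a buggy variant
# of it, both in the same notation.
PLANS = [
    Plan("accumulate", (("?v", ":=", "0"), ("?v", ":=", "?v", "+", "?x"))),
    Plan("accumulate", (("?v", ":=", "?v", "+", "?x"),),
         "Variable {v} is updated but never initialized."),
]

if __name__ == "__main__":
    print(critique("accumulate", PLANS,
                   tokenize("count := 0; sum := 0; sum := sum + n")))
    print(critique("accumulate", PLANS,
                   tokenize("count := 0; sum := sum + n")))
```

With these assumed plans, a program that initializes the accumulator matches the correct plan and draws no comment, even though the matcher first binds the variable to an unrelated statement and has to backtrack. A program that omits the initialization falls through to the buggy variant, whose comment is filled in from the variable bindings. Because each branch works on its own copy of the bindings, the only record of the matching state is the chain of active generators.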

Publication date: 1992